Optimising Naïve Bayesian Networks for Spam Detection
Abstract
In 2001, spam e-mails accounted for 8% of all e-mails sent over the Internet. By 2002, that figure had risen to 36%. Despite these staggering numbers, only about 150 people are responsible for the bulk of spam e-mail. As a result, many words and phrases are common to most spam e-mails but do not occur in desirable e-mail messages. A naïve Bayesian classifier can be employed to detect these spam messages, but the computing cost of classifying a large number of messages can be very high. To resolve this problem, two modifications are made to the classifier. First, the computation is simplified by using a unique classifier for each message. Second, the vocabulary used by the classifiers is reduced using an entropy-based technique. These modifications reduce both the cost of classification and the error rate of the classifier.
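The abstract does not spell out either modification, so the following is only a minimal sketch under stated assumptions. It assumes that "entropy-based" vocabulary reduction means ranking words by information gain and keeping the top `k`, and that a "unique classifier for each message" means scoring a message using only the vocabulary words it actually contains. The function names, the `k` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(p):
    """Binary entropy in bits; defined as 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def information_gain(docs, labels, word):
    """Drop in class entropy from knowing whether `word` occurs in a message."""
    n = len(docs)
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    h_cond = 0.0
    for subset in (with_word, without_word):
        if subset:
            h_cond += len(subset) / n * entropy(sum(subset) / len(subset))
    return entropy(sum(labels) / n) - h_cond


def select_vocabulary(docs, labels, k):
    """Assumed reduction step: keep the k words with the highest information gain."""
    words = set().union(*docs)
    ranked = sorted(words, key=lambda w: (-information_gain(docs, labels, w), w))
    return set(ranked[:k])


def train(docs, labels, vocab):
    """Log priors and Laplace-smoothed log P(word present | class); 1 = spam, 0 = ham."""
    model = {}
    for c in (0, 1):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        counts = Counter(w for d in class_docs for w in d if w in vocab)
        model[c] = {
            "prior": math.log(len(class_docs) / len(docs)),
            "word": {w: math.log((counts[w] + 1) / (len(class_docs) + 2)) for w in vocab},
        }
    return model


def classify(model, message_words):
    """Assumed per-message simplification: only words present in the message are scored."""
    present = [w for w in message_words if w in model[0]["word"]]
    scores = {c: m["prior"] + sum(m["word"][w] for w in present) for c, m in model.items()}
    return max(scores, key=scores.get)


# Toy corpus (hypothetical): each message is a set of words.
docs = [
    {"free", "money", "now"},
    {"win", "free", "prize"},
    {"meeting", "at", "noon"},
    {"project", "meeting", "notes"},
]
labels = [1, 1, 0, 0]

vocab = select_vocabulary(docs, labels, k=2)
model = train(docs, labels, vocab)
print(sorted(vocab))
print(classify(model, {"free", "offer"}))
print(classify(model, {"meeting", "tomorrow"}))
```

Scoring only the words present in a message keeps the per-message cost proportional to the message length rather than to the vocabulary size, and the smaller vocabulary shrinks the model itself. Ignoring absent words is an approximation, and whether it matches the paper's method is not confirmed by the abstract.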
Similar resources
Variable Thresholding In Naïve Bayesian Spam Filters
Email has become an essential means of communication for both business and personal use. However, the proliferation of unwanted email advertising or spam has cost organizations millions of dollars and has reduced the effectiveness of email as a communications medium. Recently, spam filters have been widely adopted as a means of combating these unwanted messages. This paper presents a method for...
An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network
In recent years, there has been considerable interest in using the short message service (SMS) as one of the essential and straightforward communication services on mobile devices. The increased popularity of this service has also increased the number of attacks on mobile devices, such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...
A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
Email is one of the fastest and most economical forms of communication. The growing number of email users has led to an increase in spam in recent years. Spam not only costs users money, time, and bandwidth, but has also become a risk to the efficiency, reliability, and security of networks. Spam developers are always trying to find ways to escape the existing filters; therefore, new filters to de...
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...
Email classification for Spam Detection using Word Stemming
Unsolicited emails, known as spam, are one of the fastest-growing and most costly problems associated with the Internet today. Among the many proposed solutions, Bayesian filtering is considered the most effective weapon against spam. Bayesian filtering works by evaluating the probability of different words appearing in legitimate and spam mails and then classifying them based on t...
Publication date: 2002